OFFER: Off-Environment Reinforcement Learning
Authors
Abstract
Policy gradient methods have been widely applied in reinforcement learning. For reasons of safety and cost, learning is often conducted using a simulator. However, learning in simulation does not traditionally exploit the opportunity to improve learning by adjusting certain environment variables, i.e., state features that are randomly determined by the environment in a physical setting but controllable in a simulator. Exploiting environment variables is crucial in domains containing significant rare events (SREs), e.g., unusual wind conditions that can crash a helicopter, which are rarely observed under random sampling but have a considerable impact on expected return. We propose off-environment reinforcement learning (OFFER), which addresses such cases by simultaneously optimising the policy and a proposal distribution over environment variables. We prove that OFFER converges to a locally optimal policy and show experimentally that it learns better and faster than a policy gradient baseline.
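To illustrate why a proposal distribution over environment variables can help with significant rare events, the following minimal sketch estimates an expected return with importance weighting. It is not the OFFER algorithm: the proposal here is fixed rather than optimised alongside a policy, and the toy return function, the probabilities, and all function names are hypothetical choices made only for this example.

```python
import random


def episode_return(rare_event: bool) -> float:
    """Hypothetical toy return: a rare event (e.g. a crash) is very costly."""
    return -100.0 if rare_event else 1.0


def expected_return(p_rare: float) -> float:
    """Exact expected return when the rare event occurs with probability p_rare."""
    return p_rare * episode_return(True) + (1.0 - p_rare) * episode_return(False)


def estimate_return(p_rare: float, q_rare: float, n: int, rng: random.Random) -> float:
    """Importance-weighted Monte Carlo estimate of the expected return.

    The environment variable is sampled from the proposal (rare event with
    probability q_rare), and each return is reweighted by the ratio of the
    true probability to the proposal probability, keeping the estimate unbiased.
    Setting q_rare == p_rare recovers ordinary sampling (all weights equal 1).
    """
    total = 0.0
    for _ in range(n):
        rare = rng.random() < q_rare
        if rare:
            weight = p_rare / q_rare
        else:
            weight = (1.0 - p_rare) / (1.0 - q_rare)
        total += weight * episode_return(rare)
    return total / n


if __name__ == "__main__":
    rng = random.Random(0)
    p = 0.01  # assumed true probability of the rare event
    print("exact:             ", expected_return(p))
    print("ordinary sampling: ", estimate_return(p, p, 20000, rng))
    print("proposal q = 0.5:  ", estimate_return(p, 0.5, 20000, rng))
```

In this toy setting, each weighted sample under the proposal takes the value -2 or 1.98, whereas ordinary sampling mixes returns of 1 with occasional returns of -100, so the weighted estimate has a much smaller variance for the same number of episodes.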
Similar resources
Off-Environment RL with Rare Events
Policy gradient methods have been widely applied in reinforcement learning. For reasons of safety and cost, learning is often conducted using a simulator. However, learning in simulation does not traditionally utilise the opportunity to improve learning by adjusting certain environment variables – state features that are randomly determined by the environment in a physical setting but controlla...
Development of Reinforcement Learning Algorithm to Study the Capacity Withholding in Electricity Energy Markets
This paper addresses the possibility of capacity withholding by energy producers, who seek to increase the market price and their own profits. The energy market is simulated as an iterative game, where each state game corresponds to an hourly energy auction with a uniform pricing mechanism. The producers are modeled as agents that interact with their environment through reinforcement learning (RL...
Multicast Routing in Wireless Sensor Networks: A Distributed Reinforcement Learning Approach
Wireless Sensor Networks (WSNs) consist of independent distributed sensors with storage, processing, sensing and communication capabilities to monitor physical or environmental conditions. WSNs face a number of challenges because of limited battery power, communication, computation and storage space. In recent years, computational intelligence approaches such as evolutionar...
Reinforcement learning for energy conservation and comfort in buildings
This paper deals with the issue of achieving comfort in buildings with minimal energy consumption. Specifically, a reinforcement learning controller is developed and simulated using the Matlab/Simulink environment. The reinforcement learning signal used is a function of the thermal comfort of the building occupants, the indoor air quality and the energy consumption. This controller is then compa...
Locomotion Planning with 3D Character Animations by Combining Reinforcement Learning Based and Fuzzy Motion Planners
Motion and locomotion planning are widely used in different fields. Locomotion planning with premade character animations has attracted considerable attention in recent years. Reinforcement Learning presents promising ways to create motion planners using premade character animations. Although RL-based motion planners offer great ways to control character animations, they have some problems that...